Modular neural networks exploit large acoustic context through broad-class posteriors for continuous speech recognition

Author

  • Christos Antoniou

Abstract

Traditionally, neural networks such as multi-layer perceptrons handle acoustic context by increasing the dimensionality of the observation vector so as to include information from the neighbouring acoustic vectors on either side of the current frame. As a result, the monolithic network is trained on a high-dimensional space. The usual practice is to use the same fixed-size observation vector across a single network that estimates the posterior probabilities of all phones simultaneously. We propose a decomposition of the network into modular components, where each component estimates the posterior of a single phone. The size of the observation vector is not fixed across the modularised networks, but is instead tailored to the phone that each network is trained to classify. For each observation vector, we estimate very large acoustic context through broad-class posteriors. The use of the broad-class posteriors along with the phone posteriors greatly enhances acoustic modelling. We report significant improvements in phone classification and word recognition on the TIMIT corpus. Our results also surpass the best context-dependent system reported in the literature.
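The modular design lends itself to a simple sketch. The following is a minimal, illustrative PyTorch sketch (not the authors' implementation) of the idea: one small network per phone, each with its own context-window size, whose input combines the stacked acoustic frames with broad-class posteriors estimated over a wide context. The feature dimension, the set of broad classes, the hidden size and the per-phone context lengths are all assumptions made for illustration.

```python
# Illustrative sketch (not the authors' implementation): one small MLP per
# phone, each with its own context window, whose input is the stacked
# acoustic frames plus broad-class posteriors estimated over a wide context.
import torch
import torch.nn as nn

N_FEATS = 39   # e.g. MFCCs plus deltas per frame (assumed)
N_BROAD = 5    # e.g. vowel/stop/fricative/nasal/silence posteriors (assumed)

class PhoneExpert(nn.Module):
    """Binary classifier estimating the posterior of a single phone."""
    def __init__(self, context_frames: int, n_hidden: int = 100):
        super().__init__()
        self.context_frames = context_frames
        in_dim = context_frames * N_FEATS + N_BROAD
        self.net = nn.Sequential(
            nn.Linear(in_dim, n_hidden),
            nn.Sigmoid(),
            nn.Linear(n_hidden, 1),
            nn.Sigmoid(),          # posterior P(phone | observation)
        )

    def forward(self, frames: torch.Tensor, broad_post: torch.Tensor) -> torch.Tensor:
        # frames: (batch, context_frames, N_FEATS); broad_post: (batch, N_BROAD)
        x = torch.cat([frames.flatten(1), broad_post], dim=1)
        return self.net(x)

# Each phone gets its own network and its own context size (values are illustrative).
experts = {"aa": PhoneExpert(context_frames=9),
           "s":  PhoneExpert(context_frames=5),
           "t":  PhoneExpert(context_frames=7)}
```

In a hybrid setup, the per-phone posteriors would typically be divided by the phone priors to obtain scaled likelihoods for HMM decoding.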


Similar articles

Hidden neural networks: application to speech recognition

In this paper we evaluate the Hidden Neural Network HMM/NN hybrid presented at last year's ICASSP on two speech recognition benchmark tasks: 1) task-independent isolated word recognition on the PHONEBOOK database, and 2) recognition of broad phoneme classes in continuous speech from the TIMIT database. It is shown how Hidden Neural Networks (HNNs) with far fewer parameters than conventional HMM...


The Use of Recurrent Neural Networks in Continuous Speech Recognition

This chapter was written in 1994. Further advances have been made since, such as context-dependent phone modelling, forward-backward training, and adaptation using linear input transformations. This chapter describes the use of recurrent neural networks (i.e., networks in which feedback is incorporated in the computation) as an acoustic model for continuous speech recognition. The form of the recurrent neural network is ...
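As a rough illustration of this family of models (layer sizes and phone inventory are assumed, not taken from the chapter), a recurrent layer over the feature frames followed by a per-frame softmax over phone classes can serve as a connectionist acoustic model:

```python
# Illustrative sketch of a recurrent acoustic model: a recurrent layer over
# the feature frames followed by a softmax giving per-frame phone posteriors.
import torch
import torch.nn as nn

class RecurrentAcousticModel(nn.Module):
    def __init__(self, n_feats: int = 39, n_hidden: int = 256, n_phones: int = 61):
        super().__init__()
        self.rnn = nn.RNN(n_feats, n_hidden, batch_first=True)
        self.out = nn.Linear(n_hidden, n_phones)

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (batch, time, n_feats) -> (batch, time, n_phones) log-posteriors
        h, _ = self.rnn(frames)
        return torch.log_softmax(self.out(h), dim=-1)
```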


Hierarchies of neural networks for connectionist speech recognition

We present a principled framework for context-dependent hierarchical connectionist HMM speech recognition. Based on a divide-and-conquer strategy, our approach uses an Agglomerative Clustering algorithm based on Information Divergence (ACID) to automatically design a soft classifier tree for an arbitrarily large number of HMM states. Nodes in the classifier tree are instantiated with small estimator...
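The tree-building step can be pictured with a small, purely illustrative sketch: greedily merging clusters of states whose (discrete) emission distributions have the smallest symmetric KL divergence. The state representation and merge rule below are assumptions for illustration, not the paper's actual ACID algorithm.

```python
# Illustrative sketch: agglomerative clustering of HMM states by symmetric
# KL divergence between their (discrete) emission distributions, in the
# spirit of an ACID-style tree-building step. Details are assumptions.
import numpy as np

def sym_kl(p: np.ndarray, q: np.ndarray, eps: float = 1e-12) -> float:
    p, q = p + eps, q + eps
    return float(np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p)))

def acid_like_tree(state_dists: list) -> list:
    """Greedily merge the closest pair of clusters until one root remains."""
    clusters = [(dist, [i]) for i, dist in enumerate(state_dists)]
    merges = []
    while len(clusters) > 1:
        # find the pair of clusters with the smallest symmetric divergence
        best = min(((a, b) for a in range(len(clusters))
                           for b in range(a + 1, len(clusters))),
                   key=lambda ab: sym_kl(clusters[ab[0]][0], clusters[ab[1]][0]))
        (pa, ma), (pb, mb) = clusters[best[0]], clusters[best[1]]
        merged = ((pa + pb) / 2.0, ma + mb)   # average the two distributions
        merges.append((ma, mb))
        clusters = [c for i, c in enumerate(clusters) if i not in best] + [merged]
    return merges

dists = [np.random.dirichlet(np.ones(8)) for _ in range(4)]
print(acid_like_tree(dists))   # sequence of merges defining the tree
```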


A hybrid SVM/HMM acoustic modeling approach to automatic speech recognition

Acoustic models based on an NN/HMM framework have been used successfully on various recognition tasks for continuous speech recognition. Recently, tied posteriors have been introduced in this context. Here, we present an approach combining SVMs and HMMs using the tied-posteriors idea. One set of SVMs calculates class posterior probabilities and shares these probabilities among all HMMs. The n...
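A toy sketch of the tied-posteriors idea (the data, class inventory and scaling below are placeholders, not the paper's setup): a single multi-class SVM produces frame-level class posteriors that every HMM state reuses, for example after dividing by the class priors to obtain scaled likelihoods.

```python
# Illustrative sketch of tied posteriors with SVMs: one multi-class SVM
# produces class posteriors for every frame, and all HMM states reuse (tie)
# these posteriors, e.g. as scaled likelihoods after dividing by the priors.
import numpy as np
from sklearn.svm import SVC

# Placeholder training data: (n_frames, n_feats) frames and frame-level labels.
X_train = np.random.randn(200, 39)
y_train = np.arange(200) % 5

svm = SVC(kernel="rbf", probability=True).fit(X_train, y_train)
priors = np.bincount(y_train) / len(y_train)

def scaled_likelihoods(frames: np.ndarray) -> np.ndarray:
    """Posterior / prior for each class; shared by all HMM states tied to it."""
    return svm.predict_proba(frames) / priors

print(scaled_likelihoods(np.random.randn(3, 39)).shape)  # (3, 5)
```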


Modular combination of deep neural networks for acoustic modeling

In this work, we propose a modular combination of two popular applications of neural networks to large-vocabulary continuous speech recognition. First, a deep neural network is trained to extract bottleneck features from frames of mel-scale filter-bank coefficients. In a similar way to what is usually done for GMM/HMM systems, this network is then applied as a non-linear discriminative feature-space t...
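A minimal sketch of a bottleneck feature extractor of this kind (layer sizes, the stacking of input frames and the number of targets are assumptions): the network is trained to predict frame targets, and afterwards the activations of the narrow layer serve as features for a conventional GMM/HMM system.

```python
# Illustrative sketch of a bottleneck network over stacked mel filter-bank
# frames: one narrow layer whose activations are later used as discriminative
# features for a GMM/HMM system. Sizes are assumed for illustration.
import torch
import torch.nn as nn

class BottleneckNet(nn.Module):
    def __init__(self, in_dim: int = 11 * 40, bottleneck: int = 42, n_targets: int = 2000):
        super().__init__()
        self.front = nn.Sequential(
            nn.Linear(in_dim, 1024), nn.Sigmoid(),
            nn.Linear(1024, 1024), nn.Sigmoid(),
            nn.Linear(1024, bottleneck),           # the bottleneck layer
        )
        self.back = nn.Sequential(
            nn.Sigmoid(), nn.Linear(bottleneck, n_targets),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.back(self.front(x))            # used only during training

    def extract(self, x: torch.Tensor) -> torch.Tensor:
        return self.front(x)                        # bottleneck features for the GMM/HMM
```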



Journal:

Volume   Issue

Pages  -

Publication date: 2001